1.0  Executive  Summary


Ultrasonic  imaging  is  the  method  of  choice  for  noninvasive  examination  of  tissue  and  real-time 
evaluation  of  blood  flow.  It  is  also  capable  of  imaging  foreign  bodies,  such  as  radiolucent  shrapnel, 
especially  in  the  abdomen  and  chest.  These  functions  would  significantly  aid  a  combat  medic  operating  in 
the  zone  of  close  combat  (or  a  civilian  counterpart  operating  at  an  accident  site)  if  a  suitable  backpackable 
instrument  existed. 

A  backpackable  ultrasound  array  imaging  system  must  be  compact,  lightweight,  electrically 
efficient  and  mechanically  rugged,  yet  provide  excellent  image  quality  and  color  flow  sensitivity.  Under 
this  SBIR  program,  we  investigated  a  sampled  analog  beamformer  to  dramatically  reduce  the  size,  weight 
and  power  consumption  of  a  portable  array  imaging  system.  The  key  technical  issue  is  to  develop  a 
specific  beamforming  architecture  approximating  the  image  quality  and  flow  sensitivity  of  current 
commercial  systems. 

Clinically, the system must be able to find and assess abdominal injury, especially free fluid.
In addition, it should be useful for other trauma applications, as well as possible obstetrics and
gynecological  examinations  using  an  endovaginal  probe.  Sensitive  color  flow  images  and  duplex  Doppler 
measurements  are  needed  to  differentiate  soft  tissue  from  blood  pools  and  identify  ruptured  blood  vessels. 
The  quality  of  such  measurements  is  directly  related  to  the  instantaneous  dynamic  range  of  the  beamformer. 
Consequently,  the  system  must  provide  an  instantaneous  dynamic  range  comparable  to  the  current 
commercial  state  of  the  art. 

Phase  I  of  this  project: 

♦  defined  a  set  of  requirements  for  the  beamformer  that  will  yield  a  system  with  performance 
comparable  to  the  current  commercial  state  of  the  art, 

♦  developed  a  manufacturable  CCD  delay  line  that  will  meet  these  requirements, 

♦  fabricated  critical  elements  of  the  CCD  delay  line,  and 

♦  developed  a  control  algorithm  that  will  maintain  a  tight  focus  at  all  image  depths. 

Simulations of the Phase I design predict:

♦  power consumption will be about 10% of that of conventional digital techniques, which makes
portable, battery powered operation feasible,

♦  dynamic  range  will  be  adequate  for  sensitive  color  flow  images  and  duplex  Doppler 
measurements,  and 

♦  the  charge  transfer  efficiency  of  the  CCDs  will  have  negligible  effect  on  image  quality. 

Phase  II  of  this  project  will  yield  a  full  beamformer  chip  suitable  for  use  in  a  portable,  battery 
powered  ultrasound  imager.  This  phase  includes  design,  two  passes  at  fabrication,  and  test.  Under  an 
optional  demonstration  phase,  a  set  of  Phase  II  beamformer  chips  will  be  configured  into  a  full  beamformer 
and  coupled  to  an  actual  transducer  probe  to  capture  data  in  real  time,  for  off-line  processing  and  display. 

A  complete  backpackable  ultrasound  imaging  system  based  on  the  beamformer  chips  will  be 
developed  as  a  Phase  III  effort. 


Proprietary  SBIR  Data;  Q-DOT,  Inc.,  Colorado  Springs,  CO;  Contract  DAMD17-96-C-6037 


2.0  Introduction 


Ultrasonic  imaging  is  the  method  of  choice  for  noninvasive  examination  of  tissue  and  real-time 
evaluation  of  blood  flow.  It  is  also  capable  of  imaging  foreign  bodies,  such  as  radiolucent  shrapnel, 
especially in the abdomen and chest. These functions would significantly aid a combat medic operating
in  the  zone  of  close  combat  (or  a  civilian  counterpart  operating  at  an  accident  site)  if  a  suitable  instrument 
existed.  However,  present  diagnostic  ultrasound  scanners  are  large,  heavy,  require  hundreds  of  watts  of 
power,  and  are  not  particularly  amenable  to  portable  operation.  (One  new  scanner  draws  so  much  power 
that  it  cannot  be  plugged  into  a  conventional  wall  socket!)  Figure  1  diagrams  a  conventional  ultrasonic 
imaging  system.  This  real-time  phased-array  system  typically  employs  a  linear  or  curved  array  of 
piezoelectric  transducers  operating  in  a  frequency  range  of  roughly  1  MHz  to  10  MHz,  depending  on  the 
penetration depth required and the desired resolution. Typical modern front-end processor circuitry
comprises  predominantly  digital  circuitry,  as  shown  in  Figure  2.  Transmission  waveforms  are  usually 
developed  as  square  waves  in  the  beamformer  which  are  subsequently  filtered  or  smoothed  into 
approximately  sinusoidal  waveforms  by  the  transducer  itself  and  its  driver.  Alternatively,  the  transmission 
waveform  is  a  series  of  digital  words  which  are  converted  to  an  analog  signal  by  a  digital-to-analog 
converter  (DAC)  prior  to  driving  the  transducer.  The  transducer  converts  the  electronic  signal  to  an 
ultrasound  waveform  which  is  directed  into  the  body  under  observation.  Body  tissue  reflects  the  ultrasound 
wave  which  the  transducer  converts  back  to  an  electric  signal.  This  signal  is  amplified  and  scaled  with  a 
time-gain  compensation  amplifier  (TGC).  The  TGC  compensates  for  depth-dependent  attenuation  of  the 
ultrasound  signal  in  tissue.  Next  the  received  signal  is  converted  to  a  series  of  digital  words  with  an 
analog-to-digital  converter  (ADC)  prior  to  beamforming. 

A  typical  digital  beamformer  performs  in-line  data  storage,  dynamic  focusing,  steering,  and 
apodization  functions  digitally.  It  comprises  random  access  memory  (RAM),  digital  signal  processors 
(DSPs),  microprocessors,  and  other  digital  logic  components.  Both  the  transmitted  and  received 
waveforms  are  focused,  steered,  and  apodized  to  support  confocal  operation.  Digital  beamforming  offers 
excellent  images  and  Doppler  flow  data  at  the  expense  of  size,  weight,  and  power  consumption  in  the 
processor.  An  individual  ADC  (and  DAC)  is  required  for  every  transducer  element.  Current  transducers 
usually  employ  32  to  192  elements.  Advanced  transducers  may  employ  more  than  1,000  elements  to 
enhance  resolution  and  reduce  phase  aberration.  These  digital  waveforms  must  be  individually  delayed, 
apodized  (by  multiplying  with  a  weighting  function),  and  summed.  For  a  typical  128-element  system  these 
functions  require  more  than  30  billion  mathematical  operations  per  second!  Fortunately,  the  functions 
readily  map  into  parallel  processors.  However,  the  combined  size  and  power  is  formidable,  especially 
when  considered  for  application  in  portable,  man-packed  equipment  for  use  by  combat  medics  in  the  zone 
of  close  combat. 
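
The delay, apodize, and sum operations described above can be sketched in a few lines of Python. This is only an illustration of the per-element arithmetic, not the design of any particular scanner: the function name `delay_and_sum`, the restriction to whole-sample delays, and the toy data are all assumptions made for the example.

```python
def delay_and_sum(channels, delays, weights):
    """Delay, apodize, and sum per-element sample streams into one beam.

    channels: list of equal-length sample lists, one per transducer element
    delays:   per-element delay in whole samples (a simplification)
    weights:  per-element apodization weights
    """
    n = len(channels[0])
    beam = [0.0] * n
    for samples, delay, weight in zip(channels, delays, weights):
        for t in range(delay, n):
            # One multiply and one add per element per output sample
            beam[t] += weight * samples[t - delay]
    return beam


# Toy data: the same echo reaches element 0 one sample before element 1.
channels = [[0, 0, 1, 0, 0, 0],
            [0, 0, 0, 1, 0, 0]]
beam = delay_and_sum(channels, delays=[1, 0], weights=[0.5, 0.5])
print(beam)  # the two echoes line up at sample 3
```

Every output sample costs one multiply-accumulate per element, which is why the operation count grows so quickly with element count and sampling rate.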

The proposed sampled-analog beamformer shown in Figure 3 provides the same functionality
as the typical digital processor described above, but at roughly 10% of its size, weight, power
consumption, and cost. The proposed sampled-analog front-end beamformer will be realized in high-speed
charge-coupled  device  (CCD)  technology  combined  on  the  same  integrated  circuit  chip  with  analog  and 
digital  CMOS  devices.  While  this  effort  addressed  only  the  beamformer  functions,  in  the  future  the  entire 
front-end  processor  may  be  integrated  with  advanced  CCD/CMOS  technology.  It  will  even  be  possible  for 
the  entire  front-end  processor  to  be  moved  out  of  the  system  into  the  transducer  probe,  resulting  in 
additional  savings  in  size,  power  consumption,  and  cost. 
Figure  1:  Ultrasonic  Imaging  System  Block  Diagram

[Figure labels: One Set Per Transducer Element; One Set per System]

Figure  3:  Proposed  Sampled  Analog  Front-End  Processor
3.0  Phase  I  Technical  Objectives


The  major  objective  of  the  Phase  I  effort  is  to  assess  the  technical  feasibility  of  developing  a 
high-performance  beamformer  for  a  compact,  low-power,  low-cost,  ultrasonic  scanner  which  meets  both 
TRP  and  industry  requirements  for  use  by  combat  medics  in  the  zone  of  close  combat  and  by  civilian 
emergency  medical  technicians  (EMTs)  at  the  scene  of  an  emergency.  Specific  objectives  include: 

1)  Define  the  target  system  specification  with  TRP  technical  personnel  and  with  commercial 
medical  ultrasound  equipment  manufacturers  to  sharpen  Q-DOT's  focus  on  real-world 
requirements; 

2)  Synthesize  candidate  ultrasonic  beamformers  using  advanced  CCD/CMOS  technology 
to  meet  the  refined  specification; 

3)  Analyze  elements  of  the  candidate  beamformers  to  discover  their  performance  limitations; 

4)  Refine  the  candidate  beamformer  architectures  based  on  analysis  of  the  elements,  estimate 
the  performance  of  the  refined  beamformers,  and  review  results  with  TRP  and  industry 
technical  personnel; 

5)  Structure  a  work  plan  to  develop  the  resulting  beamformers;  and 

6)  Briefly  document  the  Phase  I  effort,  focusing  on  its  results. 
4.0  Technical  Discussion


A  backpackable  ultrasound  array  imaging  system  must  be  compact,  lightweight,  electrically 
efficient  and  mechanically  rugged,  yet  provide  excellent  image  quality  and  color  flow  sensitivity.  A 
conceptual  configuration  is  shown  in  Figure  4.  The  bulk  of  the  electronic  scanner  is  housed  in  a  backpack. 
Wireless  (RF)  links  connect  the  scanner  to  a  local  (e.g.,  50  foot  radius)  combat  medic  and  a  remote 
radiologist.  The  combat  medic  wears  a  pod  the  size  of  a  package  of  cigarettes  (on  his  forearm  or  thigh). 
The  pod  is  linked  via  cable  to  the  hand-held  probe  and  via  a  wireless  link  to  the  backpack.  The  pod 
contains  the  beamformer,  a  color  display,  an  audio  link  to  the  radiologist,  and  batteries.  The  medic  can 
then  probe  independently  or  with  guidance  from  the  radiologist.  In  this  SBIR  program,  we  are  investigating 
a  sampled  analog  beamformer  to  dramatically  reduce  the  size,  weight  and  power  consumption  of  a  portable 
array  imaging  system.  The  key  technical  issue  is  to  develop  a  specific  beamforming  architecture 
approximating  the  image  quality  and  flow  sensitivity  of  current  commercial  systems. 


4.1  Requirements 

Clinically,  the  system  must  be  able  to  find  and  assess  abdominal  injury,  especially  free  fluid.  In 
addition,  it  should  be  useful  for  other  trauma  applications,  as  well  as  possible  obstetrics  and  gynecological 
examinations  using  an  endovaginal  probe.  Consequently,  the  system  should  be  able  to  image  with  both 
phased  arrays  (element  size  comparable  to  about  half  a  wavelength  at  the  center  frequency)  and 
convex/linear  arrays  (element  size  comparable  to  a  wavelength  at  the  center  frequency)  operating  over 
a  frequency  range  of  about  2.5-10  MHz.  Since  the  finest  spatial  resolution  theoretically  possible  is 
not a key requirement for these applications, a 64-channel beamformer is sufficient. Relatively large,
1-dimensional convex and linear arrays can image with this channel count, yielding spatial and contrast
resolution  comparable  to  the  current  commercial  state  of  the  art.  For  phased  arrays,  the  spatial  resolution 
will  be  about  half  that  of  current  commercial  systems.  Since  phased  arrays  will  only  be  used  for  a  limited 
number  of  trauma  applications,  this  resolution  loss  will  be  acceptable  if  the  system  is  highly  portable. 

Sensitive  color  flow  images  and  duplex  Doppler  measurements  are  needed  to  differentiate  soft 
tissue  from  blood  pools  and  identify  ruptured  blood  vessels.  The  quality  of  such  measurements  is  directly 
related  to  the  instantaneous  dynamic  range  of  the  beamformer.  Consequently,  the  system  must  provide  an
instantaneous  dynamic  range  comparable  to  the  current  commercial  state  of  the  art. 

Given  these  requirements,  the  system  specification  presented  in  Figure  5  was  developed.  The 
system must operate over a frequency range from 2.5 to 10 MHz. The beamformer also must be able to
generate a 90-degree sector image with focusing to a depth of 500λ to accommodate all probe types. High
quality transmit beamforming is assured with the combination of amplitude apodization (accurate to 1 part
in 64) and precise time delay accuracy (λ/32). On transmit, signals output from the sampled analog
beamformer  can  be  used  to  drive  low  impedance  amplifiers  exciting  individual  array  elements.  A  very 
short  set-up  time  is  provided  so  that  high  PRF  Doppler  is  possible. 

On  receive,  full  dynamic  focusing  and  apodization  are  needed  to  maintain  the  contrast  and  spatial 
resolution  required  for  all  possible  applications.  Again,  both  amplitude  apodization  and  precise  time  delay 
accuracy  are  provided  for  optimal  beamforming.  A  sophisticated  control  system  is  also  required  to 
maintain  a  tight  focus  at  all  image  depths.  On  each  channel,  time  delays  can  be  independently  updated  once 
every wavelength. This permits a fully dynamic aperture with an f-number as low as unity. Again, a very
short  set-up  time  is  provided  for  the  receive  beamformer  so  that  high  PRF  Doppler  is  possible. 

For  a  64-channel  beamformer,  the  maximum  acoustic  dynamic  range  (ratio  of  the  mainbeam  strength 
to  average  sidelobe  levels)  is  about  72  dB.  This  means  the  instantaneous  electronic  dynamic  range  must  at 
least  equal  this  level  for  sensitive  Doppler  measurements.  The  target  for  dynamic  range,  therefore,  is 
specified  as  70  -  75  dB.  It  may  be  possible  to  achieve  an  80  dB  electronic  dynamic  range.  If  such  a  level 
can  be  reached,  then  color  flow  images,  as  well  as  conventional  real-time  B-Scans,  will  only  be  limited  by 
the  acoustics  of  the  imaging  system.  This  is  the  current  state  of  the  art  for  commercial  array  scanners. 
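
The report does not state how the 72 dB figure is obtained. One reading that reproduces it, offered here purely as an assumption, is a mainbeam-to-average-sidelobe amplitude ratio equal to the channel count on each pass, applied once on transmit and once on receive. The function name below is hypothetical.

```python
import math


def two_way_dynamic_range_db(n_channels):
    # Assumption (not from the report): amplitude ratio of N per pass,
    # counted twice for the transmit and receive beams.
    one_way_db = 20 * math.log10(n_channels)
    return 2 * one_way_db


acoustic_db = two_way_dynamic_range_db(64)
print(round(acoustic_db, 1))  # 72.2
```

Under this reading, the 70 to 75 dB electronic target brackets the acoustic limit of a 64-channel array.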

In  the  next  section,  the  specific  sampled  analog  architecture  designed  to  deliver  these  target 
specifications  is  described  in  detail. 
4.2  Beamformer


4.2.1  Technical  Approach 

The  beamformer  architecture  diagrammed  in  Figure  6  is  the  preferred  candidate  for  implementation 
in  advanced  CMOS/CCD  technology.  Each  of  the  32  channels  of  the  beamformer  contains  the  same  circuit 
up to the final summation (Σ) and partial beam output. (This output will be summed with the outputs of
other beamformer chips to form a complete beam.) After the signal from an individual transducer element
is conditioned by its preamp and TGC, it is presented to an input channel of the beamformer. The signal is
sampled in quadrature. That is, samples are acquired at 0°, 90°, 180°, and 270° of the sampling clock. In
this example, the sampling clock (fs) operates at eight times the center frequency (fs = 8 f0), so each sample
spans 1/8th of a center frequency wavelength (λ/8). Quadrature sampling effectively increases the
sampling rate to 32 f0 and the resolution to λ/32. Quad samplers cycle through their four samplers in the
same order but with different initial conditions. Thus, λ/32 resolution is maintained among all channels in
the  beamformer  as  well  as  among  all  beamformer  chips  in  the  system.  Individual  samplers  will  be  realized 
with  Q-DOT's  proven  diode-cutoff  sampler. 
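
The sampling relationships in this paragraph reduce to simple arithmetic. The sketch below evaluates them at the 10 MHz upper end of the specified frequency range; the function name `quad_sampler_timing` is an illustrative assumption.

```python
def quad_sampler_timing(f0_hz):
    """Return (sampling clock, effective rate, delay resolution in seconds)."""
    fs = 8 * f0_hz              # sampling clock: fs = 8 f0
    effective = 4 * fs          # four quadrature phases per clock: 32 f0
    resolution = 1.0 / effective  # one wavelength period divided by 32
    return fs, effective, resolution


fs, effective, resolution = quad_sampler_timing(10e6)
print(fs / 1e6, effective / 1e6, resolution * 1e9)  # MHz, MHz, ns
```

At a 10 MHz center frequency this gives an 80 MHz sampling clock, a 320 MHz effective rate, and a delay resolution of about 3.1 ns.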


With  time  resolution  established  in  the  quad  sampler,  groups  of  three  samples  may  be  summed 
for the remaining operations. This reduces the number of delay stages required to span 56λ by a factor of
four. Since the in-line delay circuitry dominates the beamformer, and since the delay stages are nearly
minimum size, total chip area is reduced by a factor of at least three. Since the number of acceptable chips
per wafer varies approximately inversely as the square of chip area, the yield of acceptable chips per wafer
is increased by about a factor of 10 (with corresponding cost reduction). The reduction in sampling clock
rate from 32 f0 to 8 f0 combined with the reduction in delay stages reduces power dissipation in the in-line
delay  by  a  factor  of  16.  Power  dissipation  in  the  apodizer,  output  summation,  and  output  amplifier  is 
reduced  by  a  factor  of  four,  benefiting  only  from  the  change  in  sampling  clock  rate. 
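
The scaling factors quoted in this paragraph follow from a few multiplications. The sketch below works them through; the variable names are illustrative, and the area factor of three is the report's lower bound rather than a measured value.

```python
decimation = 4        # delay stages reduced by a factor of four
clock_reduction = 4   # sampling clock drops from 32 f0 to 8 f0
area_reduction = 3    # "at least three", per the text

# In-line delay power scales with both stage count and clock rate.
delay_power_reduction = decimation * clock_reduction

# Apodizer, summation, and output amplifier benefit from the clock only.
other_power_reduction = clock_reduction

# Acceptable chips per wafer vary roughly as 1 / area^2.
yield_gain = area_reduction ** 2

print(delay_power_reduction, other_power_reduction, yield_gain)
```

The yield gain of 9 at the lower-bound area factor is consistent with the "about a factor of 10" stated in the text.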

The in-line delay comprises a series of nine binary-weighted stages. Each stage can add twice as
much net delay as the previous stage. The first stage can insert a delay of zero or λ/8, the second stage
zero or λ/4, etc., to the last stage which can insert zero or 32λ. The composite delay ranges from zero to
63.875λ in λ/8 steps. The composite delay can be dynamically incremented or decremented in λ/8 steps by
incrementing or decrementing its control counter while maintaining a monotonic time sequence of signal
samples. (Details are presented in Section 4.2.2.) This single in-line delay section, together with the quad
sampler, is used for both beam steering and dynamic focusing.
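
The mapping from a control word to the composite delay can be sketched as follows, with delays expressed in wavelengths (λ). Treating the control counter as a plain 9-bit integer whose bit k enables stage k is an assumption made for illustration; the report does not give the bit assignment.

```python
N_STAGES = 9
LSB_DELAY = 1 / 8  # delay of the first stage, in wavelengths


def composite_delay(code):
    """Composite in-line delay, in wavelengths, for a 9-bit control code."""
    if not 0 <= code < 2 ** N_STAGES:
        raise ValueError("control code must fit in nine bits")
    # Stage k contributes LSB_DELAY * 2**k when its bit is set.
    return sum(LSB_DELAY * 2 ** k for k in range(N_STAGES) if (code >> k) & 1)


print(composite_delay(0))                    # 0
print(composite_delay(1 << (N_STAGES - 1)))  # 32.0, last stage alone
print(composite_delay(2 ** N_STAGES - 1))    # 63.875, all stages
```

Incrementing the code by one always lengthens the delay by exactly λ/8, which is what allows a simple counter to drive dynamic focusing.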

After  it  is  delayed  appropriately,  each  signal  is  apodized  by  attenuating  its  amplitude.  The 
apodizer  provides  64  gain  ranges  spanning  a  minimum  gain  of  zero  to  a  maximum  gain  of  63/64  in  1/64 
increments.  Apodization  can  be  dynamically  changed  in  single  level  steps  by  incrementing  its  6-bit  control 
counter.  Apodization  is  performed  with  a  charge-mode  multiplying  digital-to-analog  converter  (MDAC) 
which  Q-DOT  is  currently  employing  in  several  developmental  circuits. 

After  apodization,  charge  samples  from  all  32  channels  on  the  chip  are  summed  and  converted 
to  an  equivalent  output  current  by  a  CMOS  amplifier.  The  synchronous,  sampled  nature  of  the  data  is 
maintained  in  this  process  resulting  in  a  stepped  (zero  order  hold)  signal.  Current  outputs  from  all 
beamformer  chips  are  readily  summed  with  an  operational  amplifier  (not  shown). 

The  beamformer  is  programmed  via  individual  serial  ports  for  each  channel.  Initial  conditions  for 
the  quad  sampler  shift  register  (SR),  the  in-line  delay  counter,  and  the  apodization  counter  are  loaded  prior 
to  each  ultrasound  pulse. 

The  previous  discussion  addresses  the  beamformer  as  it  is  configured  to  receive  the  reflected 
ultrasound  pulse.  The  same  fundamental  components  are  used  for  transmit  beamforming.  For 
transmission,  all  the  inputs  are  tied  together  and  driven  from  an  on-chip  waveform  generator  (not  shown). 
After  apodization,  outputs  are  switched  from  the  common  summation  circuitry  to  individual  output  buffers 
(not shown). The same physical chip can perform both transmit and receive functions, but each chip is
typically dedicated to one use in a system. It is also possible to design a beamformer to operate bidirectionally,
effectively  providing  both  functions  on  the  same  chip. 

4.2.2  Complementary  Delay  Line 

The  key  to  the  beamformer's  small  size,  low  power,  ease  of  programming,  and  monotonic  time 
sequence  is  the  innovative  complementary  delay  line.  As  will  become  apparent,  a  key  element  in  the 
complementary  delay  line  is  a  crossbar  switch.  Shown  in  Figure  7,  the  crossbar  switch  has  two  states.  In 
the  straight  path  state,  its  two  inputs  are  connected  in  parallel  to  its  two  outputs.  In  the  crossed  path  state, 
the  two  input  signals  are  crossed  before  they  are  individually  applied  to  the  two  outputs.  No  other  states 
are  permitted.  The  symbol  for  a  crossbar  switch  is  also  shown. 
A three-stage complementary delay line is shown in schematic form in Figure 8. The input signal
is split into two equal parts which are delayed equally to the first stage. One signal is routed to the short
path (1A) and the other to the long path (1B). Path 1A is Δt long, while 1B is 2Δt long, resulting in a
Stage 1 net delay of Δt. The Stage 1 signals are next routed to the Stage 2 delays, 2A and 2B, via a
crossbar switch. The choices are (1A → 2A and 1B → 2B) or (1A → 2B and 1B → 2A). Stage 2 provides
delays of Δt and 3Δt, resulting in a net delay of 2Δt. Similarly, the Stage 2 output signals are linked to
Stage 3 via another crossbar switch. After the last stage, one of the last stage signals is linked to the output
while the other is discarded.
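
The path bookkeeping for the three-stage line can be sketched as below, with delays in units of Δt. Stage 1 and Stage 2 lengths come from the text. The Stage 3 lengths (Δt and 5Δt) are an assumption, chosen so that each stage doubles the net delay of the previous one. The function name and the string encoding of path choices are hypothetical.

```python
# (short element A, long element B) per stage, in units of Δt.
# Stage 3 values are assumed, not stated in the text.
STAGES = [(1, 2), (1, 3), (1, 5)]


def path_delays(output_choice):
    """Return (output delay, complement delay) in units of Δt.

    output_choice names, per stage, which element lies in the output
    path, e.g. "ABA" means elements 1A, 2B, and 3A.
    """
    out = comp = 0
    for (a, b), choice in zip(STAGES, output_choice):
        out += a if choice == "A" else b
        comp += b if choice == "A" else a
    return out, comp


print(path_delays("ABA"))  # (5, 8)
```

Because the two paths always use complementary elements, their delays sum to the same constant for every setting, and the eight settings give output delays from 3Δt to 10Δt in steps of Δt.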
Figure  8:  Complementary  Delay  Line


Monotonicity  of  the  time  sequence  is  diagrammed  in  Figure  9.  For  convenience,  time  is  graphed 
in discrete steps of Δt. The oldest time increment is labeled A, the next oldest B, etc. Both paths of the
complementary in-line delay diagrammed in Figure 8 are shown as branches from a common starting point
at t0. In the initial condition before switching, elements 1A, 2B, and 3A are in the output path while 1B,
2A, and 3B are in the complement path. The total delay from input to output is 5Δt. The time increments
present in each path are labeled. Then the Stage 1 elements are swapped to increase the output delay by
Δt. (The swap is done by toggling all crossbar switches.) If these elements are swapped instantaneously,
they will carry the signal present in them before the swap to their new path. The result of this
instantaneous swap is shown in the "After Switching" portion of Figure 9. Note that an extra L increment
has been added to the output sequence as the output path was lengthened. The complement path is missing
an L increment. However, both paths remain monotonic. Monotonicity is maintained for all possible ±Δt
switching. The impact of repeating or skipping a time increment is minimized by keeping Δt relatively
small, in this case λ/8.
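
The repeat-one, skip-one behavior can be reproduced with a small shift-register simulation. This is a sketch under stated assumptions, not the CCD implementation: each element is modeled as a queue of cells one Δt long, the Stage 3 lengths (Δt and 5Δt) are assumed, the samples are simply their own time indices, and the function name `simulate` is hypothetical.

```python
from collections import deque

# (A length, B length) in cells of Δt each; Stage 3 values are assumed.
LENGTHS = [(1, 2), (1, 3), (1, 5)]


def simulate(n_steps, switch_at):
    """Clock sample indices 0, 1, 2, ... through both paths and swap
    the Stage 1 elements between paths at time switch_at."""
    regs = [[deque([None] * a), deque([None] * b)] for a, b in LENGTHS]
    out_sel = [0, 1, 0]  # element in the output path per stage: 1A, 2B, 3A
    output, complement = [], []
    for t in range(n_steps):
        if t == switch_at:
            out_sel[0] ^= 1  # elements trade paths but keep their contents
        # Oldest packet in every element, read before this clock's shift
        tails = [[el[-1] for el in stage] for stage in regs]
        output.append(tails[-1][out_sel[-1]])
        complement.append(tails[-1][1 - out_sel[-1]])
        for k, stage in enumerate(regs):
            for path in (0, 1):  # 0 = output path, 1 = complement path
                el = stage[out_sel[k] ^ path]
                src = t if k == 0 else tails[k - 1][out_sel[k - 1] ^ path]
                el.pop()            # packet leaving the element
                el.appendleft(src)  # packet entering the element
    return output, complement


output, complement = simulate(n_steps=30, switch_at=10)
out = [v for v in output if v is not None]
comp = [v for v in complement if v is not None]
print(out)   # one sample appears twice
print(comp)  # the same sample is missing
```

In this run the output path repeats one sample and the complement path skips that same sample, while both sequences stay in time order.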
Figure  9:  Monotonic  Switching  of  Complementary  Delay  Line


A  CCD/CMOS  version  of  this  three-stage  complementary  delay  line  is  sketched  in  Figure  10. 
CCD  technology  is  well  suited  to  the  complementary  delay  line  architecture  because  all  signals  reside  in 
discrete  charge  packets  and  move  synchronously  through  the  delay  line  one  data  cell  at  a  time.  Therefore, 
the  crossbar  switching  is  both  instantaneous  and  lossless.  Delay  elements  corresponding  to  the  A  and  B 
paths  are  shaded.  Paths  through  the  three  crossbar  switches  are  indicated  by  distinctive  arrows. 
Nonprogrammable  charge  packet  movement  is  indicated  with  short  arrows.  The  input  charge  is  split 
precisely  by  implementing  precision  splitting  techniques  developed  at  Q-DOT.  The  innovative  crossbar 
switch  was  developed  under  this  program. 

4.2.3  Crossbar  Switch  Functional  Description 

The  crossbar  switch  is  the  key  element  in  the  complementary  delay  line.  The  crossbar  switch 
simultaneously  accepts  two  charge  packets,  and  under  digital  control,  either  passes  the  two  packets  in 
parallel  to  a  pair  of  outputs  (straight  mode),  or  it  swaps  the  two  packets  at  the  output  (cross  mode). 
The selected structure for the CCD implementation of the crossbar switch is shown in Figure 11a,
where the CCD gate outlines are labeled with the name of a clocking signal connected to the gate. The six
gates on the right are each connected through a CMOS multiplexer (switch) to either of the two clocks
indicated. The effect of changing the switch position is to swap the upper and lower clocks. The clocks
applied to the crossbar switch are shown in Figure 11b.

The gates outlined in Figure 11a are conductors placed on the top surface of the silicon, with a
thin  layer  of  insulator  separating  the  gate  from  the  silicon.  Charge  in  the  form  of  electrons  is  free  to  move 
within  an  area  known  as  a  channel  which  lies  under  the  CCD  gates.  When  a  gate  is  in  the  on  state 
(positive),  the  negative  mobile  charge  in  the  channel  is  attracted  to  the  gate  and  is  held  in  a  potential  well 
immediately  underneath  the  gate.  If  two  adjacent  gates  are  both  on,  and  the  gates  adjacent  to  that  pair  are 
off,  the  potential  well  will  be  under  both  on  gates,  and  the  charge  packet  will  be  coupled  under  both  gates. 
The  packet  of  charge  held  in  the  joint  potential  well  is  passed  along  a  channel  by  manipulating  the  clock 
voltages  applied  to  the  gates.  The  packet  of  charge  is  moved  forward  by  turning  the  upstream  gate  off, 
while  simultaneously  turning  the  downstream  gate  on,  which  has  the  net  effect  of  moving  the  charge 
forward  by  one  gate  position,  where  it  will  be  held  until  the  next  clocking  event.  A  structure  known  as  a 
diode-cutoff  sampler  (not  shown)  generates  a  packet  of  charge  proportional  to  the  input  voltage  at  a  sample 
time,  and  a  floating  diffusion  sense  circuit  (not  shown)  is  used  to  produce  a  voltage  proportional  to  the  size 
of  the  charge  packet.  This  ability  to  store  and  move  signal  samples  provides  a  means  of  delaying  the  signal 
samples  in  time,  which  makes  the  CCD  technology  ideal  for  programmable  delay  lines  in  the  beamformer. 
Figure  11:  Crossbar  Switch

A  cartoon  illustrating  the  operation  of  the  crossbar  switch  in  the  straight  mode  is  shown  in 
Figure 12. The series of diagrams shows off gates in white and on gates in gray for each clock state
in Figure 11b. At the start, Step 0, the two charge packets labeled X and Y are in similar positions on the
input side of the crossbar switch. At Step 1, the Y packet begins to advance while the X packet is held
stationary. Y continues to advance while X is stationary through Step 4. On steps 5 and 6, both packets
advance. For steps 7 through 10, the Y packet is stationary, while X advances. At Step 10, the two
packets are again aligned, and are passed out of this structure on Step 11. The two charge packets have
been  passed  straight  through  the  structure,  with  the  X  and  Y  retaining  their  relative  positions  top  and 
bottom  as  they  passed  through  the  structure. 
Figure  12:  Crossbar  Switch  in  Straight  Mode


The  cross  mode  is  shown  in  Figure  13.  Operation  is  identical  to  the  straight  mode  for  Step  0 
through  Step  3,  where  the  leading  Y  packet  has  reached  the  end  of  the  central  electrodes.  On  Step  4,  the 
Y  packet  is  directed  to  the  top,  and  on  Step  8,  the  X  packet  is  directed  to  the  bottom,  so  that  at  the  end  Y  is 
on  top,  and  X  is  on  the  bottom.  The  two  charge  packets  have  crossed  as  they  passed  through  the  structure, 
with  the  X  and  Y  packets  exchanging  their  relative  positions. 

The  clocks  repeat  in  an  eight  step  cycle,  so  at  Step  8  in  both  Figures  12  and  13  the  structure  will 
receive  the  next  pair  of  charge  packets.  The  delay  through  the  structure  shown  takes  12  steps,  which  is 
1.5  clock  cycles.  Additional  gates  (not  shown)  increase  the  delay  to  an  even  number  of  clock  cycles. 

The  clock  multiplexers  that  set  the  crossbar  mode  may  be  switched  whenever  the  pairs  of  clocks  to 
be  exchanged  are  in  the  same  state  in  either  mode.  This  eliminates  the  possibility  of  switching  transients 
due  to  clock  skew  from  disrupting  the  charge.  This  condition  occurs  at  steps  2  and  3. 
Figure  13:  CCD  Crossbar  Switch  in  Cross  Mode


4.2.4  Quad  Sampler 

The beamformer must maintain delay resolution of λ/32 for a center frequency up to 10 MHz.
The delay lines described in the preceding sections could be operated at a 320 MHz clock rate to achieve
these  goals,  but  this  would  make  the  delay  lines  too  long  and  would  require  excessive  drive  power.  A  better 
approach  is  to  have  a  short,  dynamically  changeable,  delay  operating  at  the  high  frequency  followed  by  a 
decimator  to  reduce  the  data  rate  for  the  remaining  delay  lines.  This  will  greatly  reduce  the  clock  drive 
power,  which  is  the  single  largest  power  consumer  in  the  beamformer.  For  example,  if  the  data  rate  is 
reduced  by  a  factor  of  four,  the  clock  rate  and  the  number  of  stages  in  the  following  delays  will  both  be 
reduced  by  a  factor  of  four.  Since  the  clock  drive  power  is  proportional  to  both  these  parameters,  the 
power  will  be  reduced  by  a  factor  of  nearly  16.  (The  gates  will  have  to  be  slightly  larger  to  maintain  the 
dynamic  range,  so  a  full  factor  of  16  will  not  be  realized.) 
A  decimation  factor  of  four  was  chosen  for  this  study.  The  delay  lines  will  operate  at  eight  times
the  center  frequency,  with  a  Nyquist  rate  at  four  times  the  center  frequency.  The  decimation  factor  could 
possibly  be  increased  to  eight,  yielding  a  Nyquist  rate  of  two  times  the  center  frequency.  The  impact  of 
increasing  the  decimation  factor  will  be  assessed  in  Phase  II. 

The  preferred  implementation  of  a  quad  sampler  is  shown  in  Figure  14.  A  single  sampling 
voltage-to-charge  converter  operates  at  the  full  rate  of  320  MHz.  The  CCD  input  register  continuously 
shifts the input stream in the direction shown. The four-phase clock uses the sequence P1, P2, P3, P4. At each gate marked P4, the charge will be passed forward or diverted into the output register by clocking the appropriate gate with P1, which will be connected to the gate through a multiplexer (not shown). In this
approach,  three  sequential  packets  will  be  diverted  and  summed  in  the  output  register.  The  remaining 
samples  will  be  kept  in  the  input  register,  shifted  out  and  discarded  to  the  sink. 


The  diversion  will  occur  every  four  cycles  of  the  input  register.  The  output  register  will  cycle  at 
the  diversion  rate.  The  group  of  three  stages  diverted  may  be  advanced  by  one  stage  at  any  time.  As  each 
group  of  three  stages  is  advanced,  the  group  delay  through  the  quad  sampler  is  increased  by  the  period  of 
one cycle of the input register. Thus, the resolution in setting the delay equals the period of the input register clock cycle, or λ/32. After the fourth advance, the diversion is wrapped back to the first group,
and  the  dynamic  delay  line  following  the  quad  sampler  is  incremented. 
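
A minimal behavioral sketch of this fine/coarse split is given below. It is an idealized, static-delay model, not the CCD implementation: samples are plain numbers, the function names are hypothetical, and the dynamic delay line is reduced to a whole-sample shift at the decimated rate. It only illustrates that a total delay of d input periods can be split into a coarse part (d // 4, handled after the quad sampler) and a fine part (d % 4, handled by choosing which group of three stages is diverted).

```python
def quad_sampler(x, fine, taps=3, decimation=4):
    """Sum `taps` consecutive input samples once every `decimation` inputs.

    `fine` (0..decimation-1) selects which group is diverted; each advance
    takes the group one input sample earlier, i.e. adds one input period
    of delay. Positions before the start of the record count as empty.
    """
    out = []
    for j in range(len(x) // decimation):
        start = decimation * j - fine
        out.append(sum(x[i] for i in range(start, start + taps) if 0 <= i < len(x)))
    return out


def coarse_delay(y, coarse):
    """Stand-in for the following delay line: shift by whole output samples."""
    return [0] * coarse + y[: len(y) - coarse]


def delayed_output(x, total_delay, decimation=4):
    """Apply a total delay given in input-clock periods."""
    coarse, fine = divmod(total_delay, decimation)
    return coarse_delay(quad_sampler(x, fine, decimation=decimation), coarse)


samples = list(range(1, 41))            # Q1..Q40 as the integers 1..40
print(quad_sampler(samples, 0)[:2])     # groups Q1+Q2+Q3 and Q5+Q6+Q7
print(quad_sampler(samples, 1)[1])      # advanced group Q4+Q5+Q6
```

In this model the step from fine delay 3 to total delay 4 (fine wraps to 0, coarse increments) moves the summed group by exactly one more input sample, which is the continuity the wrap is meant to preserve.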


This structure could be operated with a single sample instead of summing three samples, but the area of the gates in the input register would have to be increased by a factor of three to maintain the dynamic range. The resulting gates would have longer transfer lengths, which would seriously degrade the performance of the delay line. The summation of the three samples also forms a low-pass filter, which suppresses out-of-band noise that could otherwise be aliased into the passband upon decimation.
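
The low-pass behavior of the three-sample sum can be estimated with a short calculation. The sketch below treats the sum as an ideal three-tap moving average at the 320 MHz input rate and evaluates its gain at a 10 MHz in-band frequency and at two frequencies (70 and 90 MHz) that would fold onto 10 MHz after decimation to 80 MHz. The function name and the choice of test frequencies are illustrative assumptions.

```python
import cmath
import math


def three_sum_gain_db(f_hz, fs_hz=320e6, taps=3):
    """Gain in dB, relative to DC, of summing `taps` consecutive samples."""
    h = sum(cmath.exp(-2j * math.pi * f_hz * k / fs_hz) for k in range(taps)) / taps
    mag = abs(h)
    return 20.0 * math.log10(mag) if mag > 0.0 else float("-inf")


# In-band signal versus components that alias onto it after decimation
# by four (output rate 80 MHz, so 80 MHz +/- 10 MHz folds to 10 MHz).
for f_mhz in (10, 70, 90):
    print(f_mhz, "MHz:", round(three_sum_gain_db(f_mhz * 1e6), 1), "dB")
```

The filter has a null at one third of the input rate, so the suppression near the first alias band is modest rather than complete.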


The operation of the quad sampler is shown in the cartoon of Figure 15. At the start of the cartoon (Step 0), the first three charge samples, Q1, Q2, and Q3, have already been clocked into the input register. At Step 1, the first three diverters are activated. Three packets flow onto a common gate at the beginning of the output register where they merge into a single packet equal to Q1+Q2+Q3. Any charge that may have been present in the input register ahead of the diverted packets will be passed forward and eventually discarded in the sink.

Figure 15: Operation of Quad Sampler

At  Step  3,  the  next  input  sample,  Q4,  is  placed  at  the  beginning  of  the  input  register.  The  next 
three  stages,  marked  with  an  E,  are  empty,  since  the  charge  that  otherwise  would  be  located  there  was 
diverted  into  the  output  register.  Charge  samples  continue  to  be  passed  into  the  input  register,  so  that  by 
Step  16  the  register  contains  samples  Q4  through  Q7  positioned  for  the  next  diversion.  At  Step  17  the  next 
diversion  occurs.  Figure  15  illustrates  the  case  where  the  delay  is  advanced.  The  second,  third,  and  fourth 
diverters  are  activated  so  that  packets  Q4  through  Q6  are  diverted,  summed,  and  passed  to  the  output 
register. 


Note  that  if  the  delay  hadn't  been  advanced,  Step  17  would  have  resembled  Step  1,  and  samples 
Q5,  Q6,  and  Q7  would  be  summed.  Also,  note  that  if  four  instead  of  three  samples  were  summed,  the  sum 
at  Step  17  would  only  have  three  packets  (Q6+Q5+Q4),  since  the  fourth  packet  needed  (Q3)  was  diverted 
earlier  at  Step  1. 

This  approach  has  several  advantages  over  other  approaches  considered.  The  primary  advantage 
is  that  there  is  only  one  sampler.  Other  structures  might  have  four  samplers,  where  gain  or  offset 
mismatches  between  the  samplers  would  show  up  as  a  fixed  pattern  noise  in  the  image.  Another  advantage 
is that there are no problems with "holes" in the data as the delay is advanced. The third advantage is that
the  high  speed  section  can  be  implemented  with  smaller  gates,  which  improves  the  transfer  performance  of 
the  register. 


4.2.5  Delay  Line  Control 

The  complementary  delay  line  proposed  here  presents  several  complicated  control  issues  that  have 
been  resolved  during  Phase  I.  The  two  primary  difficulties  are:  1)  controlling  the  crossbar  switches  to 
achieve  the  appropriate  delays,  and  2)  switching  the  structure  so  the  sample  stream  is  delayed  correctly. 

As  described  below,  a  simple  gray  code  counter  solves  the  first  problem,  where  each  bit  controls  a  crossbar 
switch.  By  incrementing  this  counter,  the  correct  delay  through  the  structure  is  selected.  The  second 
problem  is  much  more  complex  and  requires  analysis  of  the  position  of  quad  sampler  outputs  as  they 
progress  through  the  complementary  delay.  The  crux  of  the  problem  is  that  when  the  quad  sampler  wraps 
around from its largest to smallest delay, the complementary delay line should increment its delay by one.
However,  the  samples  containing  a  wrap  take  time  to  progress  to  the  crossbar  that  will  repeat  one  of  those 
samples.  As  a  result,  a  variable  delay  must  be  imposed  between  a  quad  sampler  wrap  and  the  delay  change 
within  the  complementary  delay  line. 

The  crossbar  switches  within  the  delay  line  are  controlled  by  a  gray  code  sequencer.  Each  bit 
controls  one  switch.  The  control  sequences  for  a  2-,  3-,  5-,  and  9-element  delay  structure  are  given  in 
Figure  16  below.  The  generic  structure  shows  that  the  crossbars  may  operate  in  straight  or  crossed 
orientations.  The  specific  examples  given  in  Figure  16  of  10  and  15  sample  delays  are  highlighted  lightly 
in the delay structure. Changing from 10 (= 1+3+5+1) to 11 (= 2+3+5+1) involves changing the left-most
crossbar  (highlighted  darker).  Similarly,  changing  from  15  to  16  involves  changing  the  third  crossbar  so 
the  new  delay  components  are  1+1+5+9=16.  Increasing  the  number  of  crossbars  requires  proportionally 
increasing the length of the gray code control sequence.

Figure 16: Control Sequences

The  quad  sampler  at  the  input  to  the  delay  line,  as  described  previously,  reduces  the  clock  frequency 
and  power  consumption  of  the  complementary  delay  line.  It  introduces  a  difficult  control  issue,  however, 
since  switching  the  quad  sample  delays  must  be  synchronized  with  delay  line  switching.  The  last  sample 
from  the  270°  leg  must  be  repeated  in  the  complementary  delay  line  (a  +360°  change)  to  compensate  for  the 
270°  delay  change  of  the  quad  sampler.  Referring  to  the  lower  example  in  Figure  16,  if  a  sample  that  needs 
to  be  repeated  enters  the  structure  at  the  left,  it  will  take  exactly  seven  cycles  before  it  reaches  the  crossbar 
switch (the third one in this case) that repeats that sample and changes the delay from 15 to 16 (λ/8). The
delay  from  quad  sampler  wrapping  to  delay  line  switching  depends  exclusively  upon  which  crossbar  in  the 
structure  will  switch  to  change  the  delay.  Figure  17  presents  these  delays,  termed  "geographical  offsets" 
because  they  are  cycle  offsets  depending  on  the  geographic  location  of  the  crossbars  within  the  structure. 


Crossbar to Switch Geographical Offsets

7 = (4*2) - 1
12 = (7*2) - 2
21 = (12*2) - 3
38 = (21*2) - 4
71 = (38*2) - 5
136 = (71*2) - 6
265 = (136*2) - 7

Figure 17: Geographical Offsets
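
The offsets listed in Figure 17 follow a simple recurrence: each value is twice the previous one minus a step index, starting from 4. The sketch below regenerates the list and checks it against an equivalent closed form. The function names are hypothetical, and the assignment of each offset to a particular crossbar is not reproduced because it does not survive in the figure listing.

```python
def geographical_offsets(count=7, seed=4):
    """Offsets generated by a(k) = 2*a(k-1) - k with a(0) = seed."""
    offsets = []
    prev = seed
    for k in range(1, count + 1):
        prev = 2 * prev - k
        offsets.append(prev)
    return offsets


def offset_closed_form(k):
    """Closed form of the same recurrence for seed 4: 2**(k+1) + k + 2."""
    return 2 ** (k + 1) + k + 2


print(geographical_offsets())
print([offset_closed_form(k) for k in range(1, 8)])
```

The offsets roughly double from one entry to the next, so the largest one (265) still fits in a nine-bit counter.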


Controlling  the  switching  time  of  the  complementary  delay  line  to  within  one  cycle  requires 
specialized  circuitry  that  tracks  how  many  cycles  have  elapsed  since  the  quad  sampler  wrapped.  Ideally, 
this  can  be  done  with  two  counters:  one  gray  code  counter  provides  the  control  bits  to  the  delay  line,  and 
the other counter counts down the geographical offsets before applying this new gray code to the delay line. When the quad sampler wraps around, an offset (511-N) is loaded into the counter. Simultaneously, the gray code counter increments; however, a register withholds this change from the delay line. When the terminal count (511) has been reached, an overflow signal stops the binary counter and allows the new gray code to be latched into the register driving the delay control lines. The gray code counter provides the
offsets  with  the  toggle  control  bits  generated  within  the  counter  to  indicate  which  bit  will  change  on  the  next 
clock.  Only  one  of  the  nine  T  outputs  will  be  high,  since  only  one  bit  changes  for  each  delay  control  code. 
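
The two-counter idea can be sketched in a few lines of Python. This is a software illustration under stated assumptions, not the circuit of the report: the gray code is taken to be the standard reflected binary code, the "T" output is modeled as the index of the bit that will change on the next increment, and the offset counter is modeled as a binary counter preloaded with 511-N that counts up to the terminal count. All function names are hypothetical.

```python
def gray(n: int) -> int:
    """Reflected binary gray code of n."""
    return n ^ (n >> 1)


def toggle_index(n: int) -> int:
    """Index of the single bit that differs between gray(n) and gray(n + 1)."""
    nxt = n + 1
    return (nxt & -nxt).bit_length() - 1


def cycles_until_latch(offset: int, terminal: int = 511) -> int:
    """Preload the counter with terminal - offset, count up once per cycle,
    and report how many cycles pass before the terminal count is reached."""
    count = terminal - offset
    cycles = 0
    while count < terminal:
        count += 1
        cycles += 1
    return cycles


for n in range(6):
    print(n, format(gray(n), "09b"), "next toggle: T%d" % toggle_index(n))
print("latch after", cycles_until_latch(38), "cycles for an offset of 38")
```

Because exactly one bit changes per increment, the toggle index identifies which crossbar will switch, and therefore which offset must be counted before the new code is released to the delay line.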

The  control  scheme  presented  above  would  work  well  if  the  frequency  of  delay  changes  were  very 
low  relative  to  the  frequency  that  samples  move  through  the  structure.  In  that  case,  repeated  samples 
would  always  have  a  chance  to  propagate  through  the  structure  before  another  sample  must  be  repeated. 
Here, however, the delay clock (that moves the samples along) is 8f0 and the fastest the quad sampler will wrap is f0/4. In the worst case, therefore, input samples must be repeated (i.e., the delay must change)
every  32  samples.  If  one  of  the  larger  offsets  is  being  counted  (e.g.,  38, 71, 136,  or  265)  at  least  one  other 
offset  will  need  to  be  considered  simultaneously.  A  second  counter  is  needed  to  accommodate  the  situation 
where  a  large  offset  is  followed  closely  by  another  smaller  one.  The  modified  control  circuitry  is  presented 
below  in  Figure  18. 


Figure  18:  Modified  Control  Circuitry 

When  a  bit  in  the  control  code  to  the  delay  line  changes,  the  sample  at  the  appropriate  crossbar  is 
repeated.  For  the  case  where  there  are  two  offsets  being  counted  simultaneously,  the  control  code  for  the 
delay  line  deviates  from  the  normal  gray  code  sequence.  Essentially,  the  bits  representing  the  offset  must 
not change until that offset has expired. The registers in the code modifier are disabled when the corresponding offset is being calculated (i.e., the counter is running). Therefore, the new piece(s) of the gray code from the counter is (are) held off from the delay line until the appropriate binary counter reaches its terminal count and enables the code modifier register(s) to load. Note that simultaneous expiration of the offset counters is not a problem here; however, two crossbars will change in the same cycle to correctly
delay the sample stream.

The bump clock generator shown in Figure 19 is a simple computational engine that creates a pulse that follows the 1/R relation for maintaining an in-focus dynamic receive beam. The delays can be changed as quickly as once every range clock cycle, which is set by a minimum f/number criterion (2 in this case). The bump clock generator solves the equation:

$$\Delta = \frac{n_0 + \Delta n}{n_0 - m - 1},$$

where Δ is the number of range clocks between delay changes, n0 is the number of range clock cycles from the start of a beam to when the element turns on (for apodization), Δn is the number of range clock ticks since turn on, and m is the delay change index. The constant n0 is calculated from various operating parameters such as sample rate, range clock rate, element position, and steering angle. A simple algorithm that solves this equation is shown in Figure 19. The value of B represents n0+Δn and A represents n0-m-1. C is used to calculate how many times A goes into B, because Δ=B/A. C also keeps track of remainders. When the appropriate number of inter-bump range clock cycles have elapsed (i.e., Δ), another bump will be generated.


Figure  19:  Bump  Clock  Generator 


4.2.6  Prototype  Element  Fabrication 


A test chip is being fabricated to verify the functionality and performance of several of the basic CCD elements described in this report. This test chip was laid out according to the design rules dictated by the foundry fabricating
the device. These design rules, which dictate such layout details as minimum mask feature widths and spacings, have a major impact on size, and, hence, the performance of the elements. It is therefore
necessary  to  actually  lay  out  the  critical  parts  of  the  beamformer  to  make  realistic  predictions  of  the 
device  performance. 


A  plot  of  the  masks  for  the  test  chip  is  shown  in  Figure  20.  There  are  three  test  structures.  The 
first,  magnified  in  Figure  21,  is  the  crossbar  switch  described  previously,  with  separate  input  and  output 
structures.  The  clock  multiplexers  are  included.  Tests  of  this  element  will  show  dynamic  range,  crosstalk, 
and any switching upsets.

Figure 21: Crossbar Switch Test Element


The  second  structure,  magnified  in  Figure  22,  is  a  six-loop,  complementary  delay  line  with  a 
maximum  programmable  delay  of  64.  This  is  the  dynamic  delay  range  of  the  fine  delay  identified  in  the 
system  requirements.  In  the  test  element,  a  single  input  structure  is  followed  by  a  charge  splitter  to  make 
the  two  inputs  to  the  complementary  delay.  Separate  output  buffers  are  provided.  Tests  of  this  element 
will  show  dynamic  range,  crosstalk,  switching  effects,  and  charge  transfer  efficiency  effects  of  the  delay 
line. 


The  third  structure,  the  quad  sampler  described  earlier,  is  shown  in  Figure  23.  A  single  input 
structure  and  output  buffer  are  provided.  Tests  will  show  dynamic  range,  and  will  identify  the  clock 
voltages  needed  for  correct  operation. 

4.3  Potential  Performance  Limiting  Factors 

As  the  beamformer  design  evolved,  several  factors  were  considered  which  appeared  capable  of 
limiting  beamformer  and/or  scanner  performance.  Three  such  factors  were  analyzed:  charge  transfer 
efficiency,  dynamic  range,  and  power  consumption.  The  following  subsections  address  each  of  the  factors 
individually.

Figure 22: Complementary Delay Line Test Element


4.3.1  Charge  Transfer  Efficiency  Effects 


Incomplete  charge  transfer  between  adjacent  CCD  gates  modifies  the  charge  packet  stream.  A 
small  fraction  of  each  packet  is  left  behind  as  charge  is  shifted  along  the  structure.  The  charge  left  behind 
combines  with  other  charge  packets  later  occupying  the  same  potential  well  under  the  CCD  gate.  This 
charge  sharing  is  characterized  by  a  numerical  fraction  of  the  charge  fed  forward,  and  is  called  charge 
transfer  efficiency  (CTE).  CTE  varies  dramatically  between  different  CCD  structures  and  processes. 
Buried-channel CCDs (like the ones used on this project) have by far the best CTE because the charge-carrying potential well lies below the surface of the silicon-oxide interface and thereby avoids charge-trapping surface states. Although CTE values of greater than 0.999 are expected for the proposed project,
process  variations  or  increased  operating  frequencies  could  reduce  the  CTE  to  0.997.  A  worst  case  CTE 
of  0.995  will  be  used  for  investigating  its  effects  on  the  proposed  beamforming  system.  Although  it  is 
unlikely  that  charge  transfer  efficiency  will  be  this  poor,  it  is  useful  to  investigate  its  effect  on  imaging. 


Figure 23: Quad Sampler Test Element

A charge packet in a realistic CCD is a combination of the current charge packet together with ever decreasing contributions from those packets that have come before. The Nth packet, for instance, has the following characteristic after being fed forward once:

$$V_N((n+1)\tau) = (1-\varepsilon)\,V_{N-1}(n\tau) + \varepsilon\,V_N(n\tau),$$

where V_N is the voltage under the Nth gate, 1-ε is the CTE, and nτ is the clock index. The (1-ε)V_{N-1}(nτ) term represents the value of the current packet just fed forward, whereas εV_N(nτ) is that portion of the previous packet left behind. Taking a Z-transform and combining the ε-related terms into a dispersion factor D(z), we have:

$$V_N(z) = \frac{1-\varepsilon}{1-\varepsilon z^{-1}}\,z^{-1}\,V_{N-1}(z) = D(z)\,z^{-1}\,V_{N-1}(z).$$

For M such transfers, the dispersion factor is raised to the power M, as is the delay factor z^{-1}:

$$H(z) = \left[\frac{1-\varepsilon}{1-\varepsilon z^{-1}}\right]^{M} z^{-M} = D^{M}(z)\,z^{-M}.$$

This can be expanded to:

$$H(z) = \sum_{n=0}^{\infty} \binom{M+n-1}{n}\,\varepsilon^{n}\,(1-\varepsilon)^{M}\,z^{-(M+n)},$$

which is just a filter acting on the input data stream. This filter is phase-linear and low-pass in nature and
rolls  off  quickly  for  very  large  values  of  M,  which  would  occur  in  very  long  CCD  structures. 

To  understand  CTE  effects  on  image  quality,  we  have  simulated  CTE  losses  within  the  delay 
structures  of  the  system  by  calculating  the  appropriate  filters  and  applying  them  to  the  sampled  ultrasound 
RF  signals.  Figure  24  shows  the  Fourier  Transform  of  a  single  channel  RF  pulse  for  the  wire  target 
phantom  (f0=  3.4  MHz,  fs/4  =  108/4  =  27  MHz).  The  noise  floor  is  visible  approximately  60  dB  down 
from  the  ultrasound  carrier  frequency  components.  Superimposed  on  this  graph  is  the  CTE  (=  0.995)  filter 
frequency response due to 512 and 250 charge transfers. This figure shows that the oversampling imposed
by  the  system  to  achieve  sufficient  delay  accuracy  also  allows  the  imaging  frequencies  to  lie  within  the 
passband  of  the  CTE  filter.  Some  attenuation  will  occur  for  elements  undergoing  large  delays  (e.g.,  512 
transfers).  Others,  however,  will  have  little,  if  any,  delay,  so  the  effects  of  CTE  attenuation  will  be 
minimal.  A  group  delay  will  also  be  imposed  by  the  CTE  filter  and  has  the  potential  to  disrupt  some  of  the 
dynamic  focusing  of  the  system.  Static  group  delays  (i.e.,  delays  due  to  large  steering  angles),  however, 
can  be  compensated  by  beamforming  software. 
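
The CTE filter response discussed above can be approximated directly from the expression for H(z). The sketch below evaluates the attenuation and the excess group delay (beyond the nominal M samples) at the carrier frequency, using the values quoted in the text (f0 = 3.4 MHz, a 27 MHz sample rate, CTE = 0.995, and 250 or 512 transfers). It also sums the series coefficients as a check that the DC gain is one. The function names are hypothetical, and the group-delay expression is derived here from D(z) rather than taken from the report.

```python
import cmath
import math


def cte_filter_gain_db(f_hz, fs_hz, transfers, cte):
    """Magnitude in dB of H(z) = D(z)^M z^-M at frequency f_hz."""
    eps = 1.0 - cte
    z_inv = cmath.exp(-2j * math.pi * f_hz / fs_hz)
    d = (1.0 - eps) / (1.0 - eps * z_inv)
    return transfers * 20.0 * math.log10(abs(d))


def cte_excess_delay_samples(f_hz, fs_hz, transfers, cte):
    """Group delay added by D(z)^M, in samples, beyond the nominal M."""
    eps = 1.0 - cte
    w = 2.0 * math.pi * f_hz / fs_hz
    per_transfer = eps * (math.cos(w) - eps) / (1.0 - 2.0 * eps * math.cos(w) + eps * eps)
    return transfers * per_transfer


def impulse_response(transfers, cte, n_terms):
    """First n_terms series coefficients of H(z); term n multiplies z^-(M+n)."""
    eps = 1.0 - cte
    scale = (1.0 - eps) ** transfers
    return [math.comb(transfers + n - 1, n) * eps ** n * scale for n in range(n_terms)]


F0, FS, CTE = 3.4e6, 27e6, 0.995
for m in (250, 512):
    print(
        m, "transfers:",
        round(cte_filter_gain_db(F0, FS, m, CTE), 2), "dB,",
        round(cte_excess_delay_samples(F0, FS, m, CTE), 2), "samples excess delay",
    )
print("sum of coefficients:", round(sum(impulse_response(512, CTE, 80)), 6))
```

Both quantities scale linearly with the number of transfers, which is why elements with small delays see little attenuation or defocusing.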

Figures  25  and  26  present  70  dB  ultrasound  images  of  wires  in  a  water  tank  produced  by  the 
proposed system. Figure 25 does not take into account charge transfer efficiency effects, whereas Figure 26 has a CTE of 0.995. Very little difference between these two images is apparent. In fact, the central portions of these two images are almost identical because the number of delays imposed on the signals is small due to the near-zero steering angle. There is a very slight reduction in the peak wire
intensities  throughout  Figure  26,  due  to  CTE  filter  attenuation  as  well  as  group  delay  defocusing  effects 
which  were  not  corrected. 

Figures  27  and  28  present  50  dB  ultrasound  images  of  a  phantom  with  randomly  distributed 
scatterers everywhere except within four cylindrical anechoic (i.e., cyst) regions visible in the images. There are no noticeable differences between Figure 27, which does not account for CTE effects, and
Figure  28  which  has  a  CTE  of  0.995  imposed. 

These simulations show that charge transfer inefficiency within the CCD delay line (for CTE down to 0.995) has minimal effect on image quality. Attenuation effects of CTE are minimized by the oversampling ratio of the system, although some attenuation does occur. Processing across the array
also  reduces  CTE  effects  because  every  beam  sums  elements  with  large  and  small  delays  imposed, 
averaging  the  overall  attenuation.  Additionally,  the  group  delay  imposed  by  the  filter  apparently  does  not 
greatly  affect  the  focusing  ability  of  the  beamformer.  Even  if  it  did,  the  control  circuitry  of  the  beamformer 
could  easily  be  programmed  to  take  these  dispersion  effects  into  account. 

Expected  charge  transfer  efficiencies  for  this  project  range  from  0.999  to  0.9999.  These 
simulations  show  that  the  beamformer  could  tolerate  a  poor  0.995  CTE  without  noticeable  image 
degradation.  Charge  transfer  effects,  therefore,  are  not  considered  to  be  a  significant  technical  challenge 
for this project.

Figure 24: Fourier Transform of Wire Target

Figure 27: Cyst Phantom, No CTE Effects

Figure 28: Cyst Phantom, CTE = 0.995

4.3.2 Dynamic Range


The gates used in the delay lines have a drawn area of approximately 100 µm². Fringing and the impingement of field oxide into the channel reduce the effective area to 75 µm². The buried-channel implants result in a maximum charge density of 3.3×10¹¹ electrons/cm², which corresponds to a maximum signal swing of 240,000 electrons peak-peak, or 85,000 electrons RMS for a sine wave. The effective capacitance of the input signal gate is 2.8×10⁻⁸ F/cm², which yields an input capacitance of Cin = 20 fF. The charge is sensed on an output diffusion which is computed to have a capacitance of 12 fF.

The  noise  generated  in  the  CCD  input  circuit  is  expected  to  consist  of  the  so  called  "kTC"  noise, 
which results in a total noise charge at the CCD output of (kT(Cin+Cout))^(1/2) = 72 electrons RMS. The
output  buffer  degrades  the  noise  by  approximately  1  dB,  yielding  a  net  channel  dynamic  range  of  60.5  dB. 
The  processing  gain  that  results  from  summing  64  channels  adds  18  dB,  which  gives  the  overall  dynamic 
range  of  78.5  dB.  This  performance  meets  the  system  requirements  identified  in  Section  4.1. 
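
The dynamic range budget above can be reproduced with a short calculation. The sketch below uses the capacitances, signal swing, buffer penalty, and channel count quoted in the text. The operating temperature is not stated in this section, so 300 K is an assumption, and the variable and function names are illustrative.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # electron charge, C


def ktc_noise_electrons(c_in_f, c_out_f, temp_k=300.0):
    """RMS kTC noise charge in electrons; temp_k = 300 K is an assumption."""
    return math.sqrt(K_B * temp_k * (c_in_f + c_out_f)) / Q_E


full_well_e = 3.3e11 * 75e-8                     # electrons/cm^2 x 75 um^2 in cm^2
signal_rms_e = 240_000 / (2.0 * math.sqrt(2.0))  # sine wave: peak-peak to RMS
noise_e = ktc_noise_electrons(20e-15, 12e-15)    # Cin = 20 fF, Cout = 12 fF

channel_dr_db = 20.0 * math.log10(signal_rms_e / noise_e) - 1.0  # 1 dB buffer penalty
array_gain_db = 10.0 * math.log10(64)            # 64 channels, uncorrelated noise
total_dr_db = channel_dr_db + array_gain_db

print(round(full_well_e), round(signal_rms_e), round(noise_e, 1))
print(round(channel_dr_db, 1), round(array_gain_db, 1), round(total_dr_db, 1))
```

The processing gain is taken as 10·log10(64) because the signal adds coherently across channels while the noise is assumed uncorrelated, which matches the 18 dB figure in the text.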


4.3.3  Power  Consumption 

The  power  needed  to  operate  the  CCDs  is  actually  consumed  in  the  clock  drivers.  When  each 
clock  line  is  pulled  positive,  the  charge  on  the  capacitance  of  the  line  flows  as  current  from  the  positive 
power  supply.  The  energy  stored  on  the  clock  line  capacitance  is  dissipated  when  the  voltage  is  pulled 
negative.  The  power  consumed  in  the  driver  is  given  by 

$$P = 1.5\,C V^{2} f$$

where 1.5 is the driver efficiency,
C is the clock line capacitance,
V is the change in voltage (5 V),
and f is the average frequency.

Total  clock  line  capacitance  for  a  channel  consisting  of  a  four-phase  sampler,  and  a  512-stage 
complementary  delay  line  is  58  pF,  in  a  section  operated  at  a  maximum  of  80  MHz,  and  0.8  pF  in  a  section 
operated  at  320  MHz.  The  driver  power  will  be  185  mW.  Note  that  this  power  is  proportional  to  the 
sampling rate, which is related to the detector frequency. For the minimum specified center frequency, 2.5 MHz, the power will be reduced to 46 mW.

The  output  buffer  is  estimated  to  require  between  20  and  30  mW,  per  channel,  making  the  total 
channel  power  215  mW,  worst  case. 

If  the  four-phase  sampler  proposed  is  replaced  by  an  eight-phase  sampler,  the  maximum  CCD 
clock  rate  is  reduced  to  40  MHz,  and  the  delay  line  requires  only  256  stages.  In  this  case,  the  clock  line 
capacitance becomes 32 pF operated at a maximum of 40 MHz, and 1.4 pF operated at 320 MHz. The
maximum  driver  power  will  then  be  65  mW.  The  impact  on  other  aspects  of  the  beamformer  will  be 
thoroughly  assessed  before  making  this  change.  Other  techniques  for  reducing  power  will  be  considered 
in  Phase  II. 
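
The driver power figures in this subsection follow from the formula P = 1.5 C V² f applied to each clocked section and summed. The sketch below reproduces them from the capacitances and clock rates quoted above. The function name is hypothetical, and the scaling to 2.5 MHz assumes every clock rate scales in proportion to the center frequency, as the text implies.

```python
def driver_power_w(c_farads, f_hz, v_swing=5.0, k=1.5):
    """Clock driver power P = k * C * V^2 * f for one clocked section."""
    return k * c_farads * v_swing ** 2 * f_hz


# Four-phase sampler with a 512-stage delay line (10 MHz center frequency).
four_phase_w = driver_power_w(58e-12, 80e6) + driver_power_w(0.8e-12, 320e6)

# Same design at the minimum center frequency of 2.5 MHz.
four_phase_min_w = four_phase_w * (2.5 / 10.0)

# Eight-phase sampler with a 256-stage delay line.
eight_phase_w = driver_power_w(32e-12, 40e6) + driver_power_w(1.4e-12, 320e6)

for label, p in (
    ("four-phase, 10 MHz", four_phase_w),
    ("four-phase, 2.5 MHz", four_phase_min_w),
    ("eight-phase, 10 MHz", eight_phase_w),
):
    print(f"{label}: {p * 1e3:.1f} mW")
```

In both configurations the slower delay-line section dominates the total, so the saving from the eight-phase sampler comes mainly from halving both the capacitance and the clock rate of that section.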
4.4 Phase II Development Plan

Q-DOT  proposes  to  provide  all  labor,  materials,  services,  and  communication  necessary  to  develop 
a  low-cost,  low-power,  high-performance  beamformer  based  on  CCD/CMOS  technology.  Matthew 
O'Donnell,  Professor  of  Electrical  and  Computer  Science  at  the  University  of  Michigan  and  a  world-class 
designer of medical ultrasound systems, will guide Q-DOT's beamformer design, drawing on his extensive experience in related systems. Q-DOT's plan to develop the proposed beamformer chip is diagrammed in
Figure  29,  the  Phase  II  Work  Plan. 

Q-DOT  also  proposes  an  Optional  program  to  demonstrate  the  beamformer  chip.  The  plan  for 
the  demonstration  Option  is  shown  in  Figure  30.  Details  of  the  two  work  plans  are  presented  in  the 
Appendix.

Task 1  System Study


The  requirements  defined  under  Phase  I  will  be  fleshed  out  to  include  such  matters  as  the  exact 
functions  and  interfaces  to  the  beamformer  chip.  This  task  will  result  in  a  complete  specification  for  the 
beamformer  chip. 

Task  2  Beamformer  Chip  Design  I 

Given the specification from Task 1, we will design the details of the various sections of the
beamformer  chip.  The  apodizer,  which  was  not  investigated  in  Phase  I,  will  be  based  on  a  design 
developed  at  Q-DOT  on  another  program.  Both  digital  and  analog  simulations  will  be  performed  to  assure 
that  the  design  will  meet  the  specifications. 

Task  3  Layout  and  Verification  I 

The  actual  layout  of  the  beamformer  chip  will  be  done  in  Task  3.  We  will  verify  the  layout  using 
computer  aided  design  (CAD)  software.  Critical  functions  will  be  broken  out  into  separate  test  cells. 

Task  4  Fabrication  I 

The  chip  will  be  combined  on  a  mask  set  with  one  or  more  designs  from  other  projects  for 
fabrication  at  an  outside  foundry.  The  cost  sharing  from  this  multiproject  approach  will  reduce  the  cost 
to  this  particular  program. 

Task  5  Test  and  Evaluation  I 

The  chip  will  be  tested  against  the  specification  developed  in  Task  1.  Critical  test  elements  will  be 
tested  as  needed  to  isolate  any  problems  or  performance  issues. 


Task 6  Beamformer Chip Design II

Task 7  Layout and Verification II

Task 8  Fabrication II

Task 9  Test and Evaluation

A  second  pass  at  the  chip  design,  layout,  fabrication,  and  test  is  planned  to  correct  any  problems 
with  the  first  pass  design.  We  consider  a  second  pass  to  be  necessary  for  an  analog  chip  of  this  complexity. 

Task  10  Test  Preparation 

A  custom  test  fixture  will  be  designed  and  built  under  this  task.  We  will  develop  custom  software 
to  control  the  tests,  and  manage  the  resulting  data. 

Task 11  Program Management

Program  management  continues  throughout  the  duration  of  the  project.  We  will  generate  a 
comprehensive final report at the conclusion of this task.

Optional Demonstration Project


The  optional  demonstration  project  will  use  the  beamformer  chips,  with  an  ultrasound  probe, 
preamplifiers,  and  TGC  to  acquire  ultrasound  data  in  real  time.  The  stored  data  will  be  processed  off-line 
to  generate  several  sequential  frames  of  ultrasound  images  for  display.  The  demonstration  will  use  a 
transducer  probe  from  a  commercially  available  ultrasound  system  with  characteristics  similar  to  the 
proposed  system. 

Task  20  Demonstration  System  Definition 

The  requirements  on  the  hardware  for  the  demonstration  system  will  be  developed  under  this  task. 
We  will  design  and  build  the  analog  front-end  consisting  of  the  preamplifiers,  time  gain  control, 
transmit/receive  switches,  and  drivers. 

Task  21  Demonstration  System  Software 

At  the  start  of  this  task,  we  will  define  the  software  to  control  the  beamformer  in  real  time,  and  for 
off-line  video  processing.  We  will  develop  this  software  in  conjunction  with  the  University  of  Michigan. 

Task  22  Demonstration  System  Hardware 

The  frame  grabber  will  be  developed  under  subcontract  to  the  University  of  Michigan. 

Task  23  Demonstration  System  Integration 

Once  the  first-pass  beamformer  chips  are  evaluated,  we  will  combine  the  probe,  front-end,  control 
software and off-line software to make the complete demonstration system. We are assuming that the first-pass beamformer chips will be functional, so that the integration and debug of the demonstration system
may  be  completed  by  the  time  the  final  chips  are  ready. 

Task  24  Demonstration 

The  system  will  be  demonstrated  for  our  sponsors  and  interested  industry  personnel. 

Task  25  Demonstration  Program  Management 

The results of the demonstration project will be summarized in a comprehensive final report.

5.0 Conclusions


It  is  feasible  to  realize  a  multichannel  beamformer  based  on  CCD  technology  for  a  portable 
ultrasound system having performance comparable to the current commercial state of the art. The power consumption will be 10% of that of conventional digital techniques, so battery-powered operation is feasible.
Dynamic  range  will  be  adequate  for  sensitive  color  flow  images  and  duplex  Doppler  measurements.  A 
major  limitation  in  the  use  of  CCD  delay  lines,  charge  transfer  efficiency,  will  have  negligible  effects  on 
image quality. A chip containing critical test elements is being fabricated.

Figure 32: Phase II Demonstration Option Work Plan Detail (task labels: Requirements Definition, SW Design, SW Design Review, SW Implementation, Final Report)

REPLY TO
ATTENTION OF:

DEPARTMENT  OF  THE  ARMY 

US  ARMY  MEDICAL  RESEARCH  AND  MATERIEL  COMMAND 
504  SCOTT  STREET 

FORT  DETRICK,  MARYLAND  21702-5012 


MCMR-RMI-S (70-1y)


4  Dec  02 


MEMORANDUM  FOR  Administrator,  Defense  Technical  Information 
Center (DTIC-OCA), 8725 John J. Kingman Road, Fort Belvoir,
VA  22060-6218 

SUBJECT:  Request  Change  in  Distribution  Statement 


1.  The  U.S.  Army  Medical  Research  and  Materiel  Command  has 
reexamined  the  need  for  the  limitation  assigned  to  technical 
reports  written  for  this  Command.  Request  the  limited 
distribution  statement  for  the  enclosed  accession  numbers  be 
changed to "Approved for public release; distribution unlimited."
These  reports  should  be  released  to  the  National  Technical 
Information  Service. 

2.  Point  of  contact  for  this  request  is  Ms.  Kristin  Morrow  at 
DSN  343-7327  or  by  e-mail  at  Kristin.Morrow@det.amedd.army.mil. 

FOR  THE  COMMANDER: 


Encl


RINEHART 
Deputy  Chief  of  Staff  for 
Information Management